SUBMITTED TO IEEE TRANS. ON AUTOMATIC CONTROL 1 

The Caratheodory-Fejer-Pisarenko decomposition 
and its multivariable counterpart 

Tryphon T. Georgiou, Fellow, IEEE 



Abstract 

When a covariance matrix with a Toephtz structure is written as the sum of a singular one and a positive scalar 
multiple of the identity, the singular summand corresponds to the covariance of a purely deterministic component 
of a time-series whereas the identity corresponds to white noise — this is the Caratheodory-Fejer-Pisarenko (CFP) 
decomposition. In the present paper we study multivariable analogs for block-Toeplitz matrices as well as for 
matrices with the structure of state-covariances of finite-dimensional linear systems (which include block-Toeplitz 
ones). We characterize state-covariances which admit only a deterministic input power spectrum. We show that 
multivariable decomposition of a state-covariance in accordance with a "deterministic component + white noise" 
hypothesis for the input does not exist in general, and develop formulae for spectra corresponding to singular 
covariances via decomposing the contribution of the singular part. We consider replacing the "scalar multiple of 
the identity" in the CFP decomposition by a covariance of maximal trace which is admissible as a summand. 
The summand can be either (block-)diagonal corresponding to white noise or have a "short-range correlation 
structure" correponding to a moving average component. The trace represents the maximal variance/energy that 
can be accounted for by a process (e.g., noise) with the aforementioned structure at the input, and the optimal 
solution can be computed via convex optimization. The decomposition of covariances and spectra according to 
the range of their time-domain correlations is an alternative to the CFP-dictum with potentially great practical 
significance. 

Index Terms 

Multivariable time-series, spectral analysis, spectral estimation, central solution, Pisarenko harmonic decom- 
position, short-range correlation structure, moving average noise, convex optimization. 

I. Introduction 

PRESENT day signal processing is firmly rooted in the analysis and interpretation of second order 
statistics. In particular, the observation that singularities in covariance matrices reveal a deterministic 
linear dependence between observed quantities, forms the basis of a wide range of techniques, from 
Gauss' least squares to modem subspace methods in time-series analysis. In the present work we study 
the nature and origin of singularities in certain structured covariance matrices which arise in multivariable 
time-series. 

Historically, modem subspace methods (e.g., MUSIC, ESPRIT) can be traced to Pisarenko's harmonic 
decomposition and even earlier to a theorem by C. Caratheodory and L. Fejer on a canonical decomposition 
of finite Toeplitz matrices [15], [16], [19]. The Toeplitz structure characterizes covariances of stationary 
scalar time-series. Their multivariable counterpart, block-Toeplitz matrices, having a less stringent struc- 
ture, has received considerably less attention. The present work focuses on analogues of the Caratheodory- 
Fejer-Pisarenko (CFP) decomposition to finite block-Toeplitz matrices as well as to the more general setting 
of state-covariances of a known linear dynamical system. 

In Section ini we begin with background material on matrices with the structure of a state-covariance of 
a known linear dynamical system — block-Toeplitz matrices being a special case. Section |llll discusses the 
connection between covariance realization and analytic interpolation. Section|lV|presents a duality between 
left and right matricial Caratheodory interpolation and their relation to the time arrow in dynamical systems 
generating the state-process. Duality is taken up again in Section [V] where we study optimal prediction and 
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postdiction (i.e., prediction backwards in time) of a stochastic input based on state-covariance statistics. 
The variance of optimal prediction and postdiction errors coincide with left and right uncertainty radii in 
a Schur representation of the family of consistent spectra given in [11], [12] and elucidate the symmetry 
observed in these references. Further, Section [V] presents geometric conditions on the state-covariance 
for the input process to be deterministic and for the optimal predictor and postdictor to be uniquely 
defined. Vanishing of the variance of the optimal prediction or postdiction errors is shown in Section |VT| 
to characterize state-covariances for which the family of consistent input spectra is a singleton. 

Section |vn| gives a closed form expression for the power spectrum corresponding the "central solution" 
of [12]. This result extends the theory in [12] to the case where the state-covariance is singular. Natu- 
rally, the subject of this section has strong connections with the theory of Szego-Geronimus orthogonal 
polynomials and their multivariable counterparts [4]. In this section, we present yet another generalization 
of such polynomials as they now become matricial functions sharing the eigen-structure of the transfer 
function of the underlying dynamical system. Then, Section IVIIII explains how to isolate the deterministic 
component of the power spectrum via computation of relevant residues with matrix techniques. 

Section [K] shows, by way of example, that a state-covariance may not admit a decomposition into 
one corresponding to white-noise plus another corresponding to a deterministic input. To this end, a 
natural generalization of the CFP decomposition is to seek a maximal white-noise component at the 
input consistent with a given state-covariance. We explain how this is computed and discuss yet a further 
generalization where the input "noise" is allowed to have "short-range correlation structure". For instance, 
if the state-covariance is i x i (block-)Toeplitz, then we may seek to account for input noise whose auto- 
co variance vanishes after the k < £- moment — i.e., colored noise modeled by at most a /c-order moving 
average filter. In this way, a maximal amount of variance that may be due to short range correlations 
can be accounted for, leaving the remaining energy/variance to be attributed to periodic deterministic 
components and possibly, stochastic components with long range (longer than k) correlations. 

II. Structured covariance matrices 

Throughout we consider a multivariable, discrete-time, zero-mean, stochastic process 

{uk : k E Z} 

taking values in C™^^ with m G N. Thus, is to be thought of as a column vector. We denote by 

Rk := S{ueu*i_f,}, 

for k,i & Z, the sequence of matrix covariances and by dii(9) the corresponding matricial spectral measure 



for k eZ (see e.g., [18]). As usual, star (*) denotes the complex-conjugate transpose of, prime (') denotes 
the transpose, j := following the usual "engineering" convention, and £{■} denotes the expectation 
operator. Whenever star (*) is applied to a rational function of z it represents the para-conjugate Hermitian 
f{z)* := f*(z^^) where /*(■) refers to *-ing the coefficients of /(•) whereas the transformation of the 
argument is indicated separately. 

It is well-known that a covariance sequence 



for which 




{Ri : i eZ and R^e = R*^} 



is completely characterized by the non-negativity of the block-Toeplitz matrices 




(1) 



R~e 



Ro . 
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for all a.. That is, such an infinite sequence with the property that > 0, V£, qualifies as a covariance 
sequence of a stochastic process and vice versa. On the other hand, the infinite sequence of i?^'s defines 
the spectral measure dfi (up to an additive constant) and conversely. 

It is often the case that only a finite set of second-order statistics is available, and then, it is of interest 
to characterize possible extensions of the finite covariance sequence {Rq.Ri, . . . ,R(}, or equivalently, 
the totality of consistent spectral measures (see [4], [5], [6], [7], [2], [11], [12]). In general, these are no 
longer specified uniquely by the finite sequence {Rq.Ri, . . . ,R(]. In the present paper we are interested 
in particular, in the case where a finite set of second-order statistics such as {Rq, Ri, . . . , i?^} completely 
specifies the corresponding spectral measure (and hence, any possible infinite extension as well). We 
address this question in the more general setting of structured covariance matrices which includes block- 
Toeplitz matrices as a special case. 

A block-Toeplitz matrix such as given in ([T]) can be thought of as the state-covariance of the linear 
(discrete-time) dynamical system 

Xk = Axk-i + Buk, for k E 'Z. (2) 

where 





























Om 


A = 










,B = 



















with Om and the zero and the identity matrices of size m x m, A a {i + 1) x (C-\- 1) and i? a (^+ 1) x 1 
block matrices, respectively. The size of each block is m x m and hence the actual sizes of A, B are nxn 
and n X m, with n = {i + l)m, respectively. While for general state-matrices A, B the structure of the 
state-covariance may not be visually recognizable, it is advantageous, for both, economy of notation and 
generality, to develop the theory in such a general setting — the theory of block-Toeplitz matrices being a 
special case. 

Thus, henceforth, we consider an input-to-state dynamical system as in @ where 

ih) uk e C™, Xk e C", A G C"^", B e C"^"', 

^) rank(i?) = m, 

^) (A, B) is a reachable pair, and 

(j4|i) all the eigenvalues of A 

have modulus < 1. (4) 

Without loss of generality and for convenience we often assume that the pair {A, B) has been normalized 
as well so that 

(jl;) AA* + BB* = In. 

Conditions (H^-d) are standing assumptions throughout. Whenever condition (E^) is assumed valid, this 
will be stated explicitely. With Uk G C™, /c G Z, a zero-mean stationary stochastic process we denote by 

R := £{xkxl} 

the corresponding (stationary) state-covariance. The space of Hermitian nxn matrices will be denoted 
by EI„ C C"^" while positive (resp. nonegative) definiteness of an R G EI„ will be denoted by R > 
(resp. R > 0). Any state-covariance as above certainly satisfies both conditions, i.e., it is Hermitian and 
non-negative definite. The following statement characterizes the linear structure imposed by ©■ 
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Theorem 1: (see [11]): A nonnegative-definite Hermitian matrix R (i.e., E[„ ^ R > 0) arises as 
the (stationary) state-covariance of © for a suitable stationary input process {uk} if and only if the 
following equivalent conditions hold: 



^p.) rank 



R-ARA* B 

B* 

or, equivalently. 



2m, 



R - ARA* = BH + H*B* 

for some H e C"'^". (5) 

Proof: See [11, Theorems 1 & 2]. ■ 

A finite m x m non-negative matrix-valued measure dii{9) with 9 E (— vr, tt] represents the power 
spectrum of a stationary m x 1-vector-valued stochastic process. The class of all such m x m matrix- 
valued non-negative bounded measures will be denoted by M. Note that the size m is suppressed in the 
notation because it will be the same throughout. Starting with a stationary input Uk with power spectral 
distribution dfi E M, the state-covariance of Q can be expressed in the form of the integral (cf. [18, Ch. 
6]) 

R = (^G{e^s^^G{e^'y^ (6) 

where 

G{z) := {h-zA)-'B 

is the transfer function of (with z corresponding to the delay operator, so that "stability" corresponds 
to "analyticity in the open unit disc 3 := {z E C : \z\ < 1}"). Thus, either condition (jS^) or dSj?) in the 
above theorem characterizes the range of the mapping 

Mb dfi^R 

specified by Q. The family of power spectral distributions which satisfy Q will be denoted by 

Mr := {diJ,{e) E M : equation ® holds}. 

The above theorem states that this family is nonempty when R satisfies the stated conditions. Furthermore, 
a complete parametrization of Mr is given in [11], [12]. 

The present work explores the case where Mr is a singleton. The special case where Uk is scalar 
and R a Toeplitz matrix (but not "block- Toeplitz") goes back to the work of Caratheodory and Fejer 
a century ago, and later on, to the work of Pisarenko (see [15], [16], [19]). In the scalar case, Mr is 
a singleton if and only if R is singular (and of course non-negative definite). Then Uk is deterministic 
with a spectral distribution dfi having at most n — 1 discontinuities (spectral lines). In the present paper 
we obtain analogous results when R is a state-covariance and Mr is a singleton, and then we study 
decomposition of a general R > into a covariance due to "noise" plus a singular covariance with 
deterministic components — in the spirit of the CFP decomposition of Toeplitz covariance matrices. 



III. Connection with analytic interpolation 

The early work of Caratheodory and Fejer was motivated by questions in analysis which led to the 
development of analytic interpolation theory — a subject which has since attained an important place in 
operator theory, and more recently, closer to home, in robust control engineering. We review certain 
rudimentary facts and establish notation. 
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A non-negative measure /i G M specifies an m x m matrix-valued function 

=: n[dfi]+jc (7) 

with jc an arbitrary skew-Hermitian constant (i.e., c G ]HI,„), which is analytic in the open unit disc D 
and has non-negative definite Hermitian part (see, e.g., [6, page 36]). We denote by H[dfj] the Herglotz 
integral given in previous line. The class of such m x m functions with non-negative Hermitian part in 
D, herein denoted by 

F := {F{z) : F{z) =n[dij]+jc 
with c G Elm and /i G M}, 

is named after Caratheodory and often referred to simply as "positive-real". Conversely, given F G F, a 
corresponding dfi(9) can be recovered by the radial (weak) limits of the Hermitian part of F{z); 

dii{e) = limHerm{F(re^'^)}. (8) 

In fact these two families, F and M, are in exact correspondence via Q and dU) (assuming that elements in 
F are identified if they only differ by a skew-Hermitian constant and, similarly, non-decreasing distribution 
functions /i are defined up to an arbitrary additive constant). 

Given (A, B) as above, let C G C"^", D G C™^" be selected so that 

V{z) := D + zC{In - zA)-^B (9) 

is inner, i.e., V{^yV{^) = /,„ for all |^| = 1. Since V{z) is square, V{^)V{^y = as well. If the 
normalization (H^) is in place, the condition on (C, D) for V{z) to be inner is simply that 



U :-- 



A B 
C D 



is a unitary matrix. The rows of G{z) form a basis of 

where I-L2 denotes the Hardy space of functions analytic in D with square-integrable boundary limits. This 
can be easily seen from the identity [11, equation (38)] 

G{z) = {zl - A*)-^C*V{z) (10) 

(from which it follows that the entries of G{z)V{z)* are in 7i^, the orthogonal complement of 0.2 in the 
Lebesgue space of square-integrable function on the unit circle £2(0!]))). 

Now let dfi{9) represent the power spectrum of the input to Q, R the corresponding state-covariance, 
and F{z) obtained via Then, R turns out to be the Hermitian part of the operator 

W : /C ^ /C : u{z) ^ {iy{z)F{zy) , (11) 

with respect to basis elements being the rows of G{z), where Il/c denotes the orthogonal projection onto 
/C (see [11, equations (40-41)]). Of course, R is also the Grammian 

{G{z),G{z)),, 

with respect to the inner product 



2tt 



{g,{z),g,{z)),,:= / [ g,(e^^)J^g,(e^<^) 



dfi{9) 
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This is in fact the content of ©. 

The relationship between F{z) and R can be obtained by way of W. If H is the zeroth Fourier 
coefficient of G{z)F{z)* then the matrix representation W for W with respect to the rows of G{z) 
satisfies (see [11]) 

W - AW A* = H*B* (12) 
leading to (|51) for R = + W*. The matrices W or H completely specify IljcF{z)*\fc and in fact 

F{z) = Fo{z) + Q{z)V{z) (13) 

with 

Fo{z) ■.= H{h-zA)-^B 

and Q{z) is a matrix-valued function which is analytic in D. Conversely, if F{z) e F and satisfies (fT^ . 
then it gives rise via dHJ) to a measure which is consistent with the state-covariance R. 

Equation (fT^ specifies a problem which is akin to the Nehari problem encountered in Tioo -control 
theory, but involves interpolation with positive-real functions instead of functions in 7ioo(ID')- Some of 
the early work in analytic interpolation focused on conditions in terms of interpolating values F{zi) at 
specified points G D (i = 1, . . . , n) which guarantee the existence of a scalar F{z) G F. Invariably, the 
conditions involve the non-negativity of the so-called Pick matrix. In the current setting the corresponding 
Pick matrix is non other than R (see [11], [12]). For further references and trends in literature on analytic 
interpolation see [7], [1]. 



IV. A dual formalism 
Using (flOb . equation Q can be rewritten as 

R = ^'^(a(e^-^)^a(e^-^)*), (14) 

where 

Gr{z) = {zI-A*)-^G* 

and 

dfir{9) = V{e^'^)dfi{e)V{e^y. (15) 
The rows of Gr{z), for z = e^^, span a subspace of (hI^"^)'^ which we denote by 

JC, := {nl''"')^ e {nl''"')^v{zy. 

The notation ^ denotes orthogonal complement in the "ambient" space — here £2(<9ID))^^™. It readily 
follows that a state-covariance of Q satisfies a set of dual conditions given below. 

Theorem 2: A nonnegative-definite Hermitian matrix R g C"""" arises as the (stationary) state- 
covariance of © for a suitable stationary Input process Uk if and only if the following equivalent 
conditions hold: 



rank 



2m, 



R-A*RA G* 
G* 

or, equivalently, 

(P) R - A*YiA = G*V + LG 
for some L e C"^™ 

and G selected as in Section [ffl] (i.e., so that D + zG{In - zAy^B is inner). Conditions dSt-d) are 
also equivalent to conditions ^-b). 
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It is noted that, rank(_B) = m in condition (EJ?) implies that rank(C) = m as well. To see this, assume 
without loss of generality that holds. Then B*B = Im - D*D > which implies that \\D\\ < 1. 
Using once more unitarity of U and the fact that \\D\\ < 1, we obtain that CC* = Im — DD* > which 
implies that rank(C) = m. 

An insightful derivation of Theorem El can be obtained by considering © under time-reversal. More 
specifically, we compare the state-equations for dynamical systems with transfer functions V{z) = D + 
Cz{In - zA)-^B and V{zY = D* + B*{zln - A*)-^C* given below: 

Xk = Axk-i + Buk 

Vk = Cxk-i + Duk, k = ... ,-1,0,1, .. . (16) 

and 

Xk-i = A*Xk + C*yk 

Uk = B*Xk + D*yk, k = ...,1,0,-1,.... (17) 

Both are interpreted as stable linear dynamical systems but with opposite time-arrows. Since V{z)V{z)* = 
Im, the input to one of the two corresponds to the output of the other, and (fT5l relates the spectral measure 
dfj. of {uk} to the spectral measure dfir of {Vk}- The state-covariance for both system is the same when the 
first is driven by {uk} and the second by {yk}, respectively. Thus, if R = E{xkxl}, Theorem [T] applied 
to (fT6b leads to (|5t-b) while, applied to (fTTt . leads to ©-d). The spectral measures of the respective 
inputs {uk} and {yk} relate as in ([131 . 

Proof: [Theorem^ Follows readily from the above arguments. More precisely, R is a state-covariance 
of Q for a suitable stationary input process {uk} if and only if it is also a state-covariance of 

xe+i = A*Xi + C*yi 

for a suitable stationary input process {yi, £ E Z}. Then applying Theorem [T] we draw the required 
conclusion. ■ 
An analogous dual interpolation problem ensues. To avoid repeat of the development in [11], [12], we 
may simply rewrite (fT4l) as 

R' = J^'^ (^(G,(e^-^)*)'^^^^MG,(ei'')'^ 
where now the left integration kernel is 

{Grizyy = {z-'i„, - A'r'c. 

Note that R' = R 7^ R in general, since R is Hermitian but may not be symmetric — where bar ( ) 
denotes complex-conjugation. Trading a factor z between the left integration kernel and its para-hermitian 
conjugate on the right we obtain that 

R' = /'Vn - e^'A'r'C'^^^^^^C{In - e-^'A)-' 
Jo 27r 

leading to the analytic interpolation problem of seeking an F-function of the form 

L'iln-zA'y'C + Q{z)V{zy. 
Transposing once more we may define 

Friz) = C{In - zA)-^L + V{z)Q{z) 
and draw the following conclusions. 
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Theorem 3: Let V{z) = D + Cz{In - zAy^B be an m x m inner function with {A, C) observable 
and {A,B) reachable. If L g C"""™ and R the solution to then there exists a solution H to 
equation Conversely, W H e c™""" and R the solution to then there exists a solution L to 
equation With R, L, H related via ^) and (jStl), the following are equivalent: 

(FfSb ) R > 0, 

F{z) = H{I - zA)-'B + Q{z)Viz), 
with analytic in D, 

with Qr{z) analytic in D. (18) 

Proof: Begin with L G C"^™ and R the solution to (|5]l). If R > then R is a state-covariance 
to © according to Theorem El and hence, there exists a solution iJ to equation To argue the case 
where R may not be nonnegative definite necessarily, consider without loss of generality condition (H^) 
valid and that 

'A B 
C D 



U 



is unitary. Then, /„ - A* A = C*C and /„ - AA* = BB*. If R is the solution to (|5]l) for a given L, 
then Re := R + tin is the solution of the same equation when L is replaced by L^:= L + We can 
always choose e so that R^ > and then deduce that there exists a solution to 

R, - A'R.A* = BH, + H*B*. 

Since /„ = AA* + BB*, H := — ^B* now satisfies (|5j5). The converse proceeds in the same way. 

The equivalence of dTSh ) and (fTSb) follows as in [1 1]. If R > 0, then (fTSb) follows from [11, Theorem 
2]. Conversely, if (fTSb) holds, then W (defined in (fTTTl satisfies (fT2b leading to R being its Hermitian 
part. Since G F, the Hermitian part of multiplication by F{z)* is nonnegative, and hence it remains 
so when restricted to the subspace /C. 

The dual statement (fTSb ) follows in an analogous manner. ■ 
Remark 1: If V{z) = Vi{z)V2{z) is a factorization of V{z) into a product of inner factors, then it 
can similarly be shown that the conditions dTSb of the theorem are equivalent to the solvability of a 
bi-tangential Caratheodory-Fejer interpolation problem of seeking an Fo{z) G F where Fo{z) = Ho{I — 
zA)~^Lo + V2{z)Q{z)Vi{z) for suitable Ho, Lo. (The Ho, Lq can be computed from e.g., H, B by setting 
Fo{z) as the analytic part of V2{z)F{z)V2{z)* and F{z) as in (Hlb).) □ 

V. Optimal prediction & postdiction errors 

A spectral distribution G M induces a Gram matricial structure on the space of p x m matrix-valued 
functions on the circle (see [18, pages 353, 361]) via 

{a{z),b{z))d,:= r h{e^')'^a{e^')* (19) 



JO 27r 

=£{(Y^heUk-ii)(Y^ai,Uk-ty} (20) 

where a^,6^ are the Laurent coeffients of a{z), b(z), respectively. The correspondence 

aez^ ^ ^ aiUk-e, (21) 
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h{z) = hiz^ (22) 



between functions on the unit circle (taking z = e^^) and linear combinations of the random vectors {uk}, 
leaves the respective Gram-matricial inner products in agreement and establishes a natural isomorphism 
between C2{dB)] dfi) and the space spanned by (the closure of) linear combination {u^} (see Masani [18, 
Sections 5, 6], cf. [12]). 
Any matrix-valued function 

with entries in 'H2 and 
corresponds via (ETT) to 

h{z) Uk - Mfclpast 

which is interpreted as a "one-step-ahead prediction error". Likewise, if the entries of 



1=0 



hiO) = ho = Im (23) 



h{z) = hez^ (24) 

live in z'H2 and h{0) = 



h{z) V-^Uk- Mfc|future 

corresponds to "one-step-ahead postdiction error", i.e., using "future" observations only to determine the 
"present". Occasionally we may refer to these for emphasis as prediction forward, and backwards in time, 
respectively. Either way, the "estimator", which may not be optimal in any particular way, is the respective 
linear combination of values of for A; ^ 0: 

^fc|observation range • ^ ^ h(Uk—t- 

(When the values extend in both directions it is a case of smoothing and is needed to interpret the 
F-function in Remark [T] — this will be developed in a forthcoming report.) 

We first discuss prediction in the forward direction. Throughout we consider as data the covariance 
matrix R and the filter parameters. We assume that d/j E Mr but otherwise unkown. Because dfx is not 
known outside /C, it can be shown that the min-max problem of identifying the forward prediction error 
with the least variance over all djj. E Mr has a solution which lies in /C. To this end we seek an element 
in /C™, i.e., an m x n matrix-valued function 

rG{z) with r E c™^" 

with rows in /C, having least variance 

{rG{z),rG{z)),^ = r{G{z),G{z))d,r 
= rRr\ 

and subject to the constraint (l23t which becomes 

rB = I. (25) 

Existence and characterization of minimizing matrices F is discussed next. 

Nonnegative definiteness of the difference — > between two elements i7j E (i = 1,2) 
defines a partial order Oi > in Hm. An Hm-valued function on a linear space is said to be Mm-convex 
iff 

/(aA + (1 - a)r2) < af{r,) + (1 - a)/(r2), 
for a E [0, 1]. 
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It is rather straightforward to check that if R > 0, then the quadratic 

. (^mxn ^ jj^^ . r = rnr, (26) 

is in fact Hm-convex. This basic fact ensures existence of EI,„-minimizers satisfying (l25t in the proposition 
given below. Note that the statements (ii) and (iii) of the proposition are rephrased in alternative ways 
(e.g., (ii-a), etc.) in order to highlight an apparent symmetry when expressed in terms of directed gaps 5 
(defined in the statement of the proposition) between the null space 

Ar(R) := {x e C"^^ : Rx = 0„} 

of R and the range 

n{B) ■= {x e C"^^ : x = Bvforve C"^^} 

of B — the gap metric represents an angular distance between subspaces and is a standard tool in pertur- 
bation theory of linear operators (see [17]) and in robust control (e.g., see [13]). 

Proposition 1: Let B G C"^™ having rank m, and let R g IHI„ with R > 0. The following hold: 

(i) There exists an HI,„-minimizer of qn satisfying (|25|. 

(ii) The minimizer is unique if and only if 

rank( [ R B ]) = n. 
(ii-a) The minimizer is unique if and only if 

5(A^(R),7^(5)) := ||n^(B)xU(R)|| < 1. 

(iii) The H^-minimal value for qn is Om if and only if 

5*nAr(R)5 is invertible. 
(iii-a) The H^-minimal value for q^ is Om if and only if 

5(7^(S),Ar(R))) := ||n^(R)x|7e(B)|| < 1. 

(iv) If rank(R) = n, then the H^-minimal value of qn is 

n ■= {B*R-^B)-^ > 

and a minimizer (unique by (ii)) is 

r = {B*R-^B)-^B*R-\ 

(v) If the HI„-minimal value of qn is 
then a minimizer is given by 

r = (5*n^(R)5)-^S*nAr(R). (27) 

(vi) In general, when R is singular, the H^-minimal value for q^ is 

VL = (5*R«5i)« (28) 

and a minimizer is given by 

r = (s*n^(R)5)«s*n^(R) (29) 
+(5*R»fii)«E*R«(/„ - fi(fi*n^(R)5)»5*n^(R)) 
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where denotes the Moore-Penrose pseudo-inverse of R, 



Alternatively, 



5i := Bn^^, and 

Xo := Ar(5*n^(R)5) 



Q = lim 
r = lim 



(30) 



where 



R. 



nj^B*R~^ 

R + en^(R) 



(31) 



and e > 0. 

Proof: Claim (i): Since qniF) > 0, then for any a G [0, 1] and any Fi, r2 

«(1 - «)gR(A - Ta) > 

^ a{l - a) (ARr* + TaRr; 
-ARr* - ARA*) > 

^ aARA* + (i-")^2Rr2 

-a^ARr* - (1 - a)2ARA* 

- a)ARA* - "(1 - a)ARA* > 
agR(A) + (1 - a)gR(A) 
>gR(«A + (l-a)A). 

This proves -convexity of q^. It is also clear that qn is bounded below by Om- However, ^r is not 
necessarily radially unbounded when R is singular. Hence, we need to consider components of T which 
lie in A/'(R). Any T satisfying ( E51) is of the form 

r = To + XM 

where Tq is a particular solution of ^ (e.g., Tq = {B*B)-^B*\ the rows of M e C^""")''" span the 
left null space of B, and X is an arbitrary element of Substituting into gR we obtain 



gR(ro + XM) = XQX* + XL + L*X* + C, 



(32) 



with Q = MRM*, L = MRT*^, and C = ToRr^, which is also H^-convex in X. Since it is bounded 
below by Om, it follows that the null space of Q is contained in the null space of L*. Expressing the 
entries in (l32b with respect to the decomposition C" = 7^(Q) © A/'(Q), we may write gR(ro + XM) in 
the form 



[Xi X2] 
+ [X, X2] 



Q 



0(„ 



o 



mx{n—m) 



{n—m)xm 





'x* 




XI 



Li 



+ [LJ Om] 



XI 
X* 



+ c, 



(33) 



where Qi > 0. This expression is radially unbounded in Xi and hence, a minimizer exists (taking any 
bounded value for X2, e.g., X2 = Om)- This proves (i). 
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Claim (ii): Because all minimizers satisfy ( E51 . any two of them differ by some matrix, say A such 
that AB = Orn- Hence if qn{r) = q^i^r + A), then qnir) = qu,{r + eA), for e G [0, 1]. This is due to 
the -convexity of q^. Therefore 

e^ARA* + erRA* + eARr* = Om, 

identically for all e G [0, 1]. It follows that there is more than one minimizer if and only if there exists a 
common left null vector for both B and R (which serves as a nonzero row of A, so that A ^ Omxn)- 
This proves (ii). 

Claim (ii-a): For the definition of the directed gap in (ii-a) cf. [17]. The claim that (ii-a) is equivalent 
to (ii) is standard. Since 1Z{B)-^ coincides with J\f{B*), a common element between J\f{B*) and A/'(R) 
would lead to ||n7^(^)x |;v'(r)|| = 1- Since we are dealing with finite-dimensiional spaces the converse 
is immediate — a common vector is the only way the norm can be equal to one in this case. The rank 
condition in (ii) is obviously equivalent to J\f{B*) fl A/'(R) = {0}. 

Claim (iii) and claim (v): We now argue claim (iii) together with claim (v). If B*I1^(^r)B is invertible 
and r as in dTTl . then FB = 1^ and qnir) = Om- To show the converse assume that 3r such that 
Qnir) = Om as well as FB = /„. Then the columns of F* belong to A/'(R) and rn^(R) = F. Therefore 
1^11^(^)5 = Im and 11^(^)5 has rank m. Consequently, B*I1^(^r)B is invertible. 

Claim (iii-a): The equivalence of (iii-a) and (iii) is standard. Condition (iii-a) is equivalent to stating 
that nAr(R)k(B) has rank m. But Ilj^{B,)\n{B) + n7^(R)|7^(B) = Iti{b) (where Iti{b) denotes the identity 
operator on 1Z{B)). Because R G EI„, A/'(R)"'" = 7^(R), and condition (iii) follows. 

Claim (iv): Assume that R is positive definite and F, Vt as in (iv). Then VB = Im and q'R(F) = > 0. 
For any X G C'"''" such that XB = it can be readily seen that gR(r + X) = n + XRX* > n. 
Hence, the minimizer and minimal value are as claimed. This proves (iv). 

Claim (vi): Denote 

Uo := 7^(5*^^(R)S) 

and recall that, for any matrix M, the orthogonal projection onto its range can be obtained via 

^7^(M) = M«M. 

We verify by direct substitution that 

TB = (5*n^r(R)5)tt5*n^(R)5 + (5*R«5i)»(5i*R"5i) 

= U^^^ + nM^ (34) 

and that 

TRY* = (5*R«5i)»(fi*R»fii)(5*R«fii)» 
= Q 

as given in (vi). Step ( O^t needs the fact that 

7^(5*R»5l) = K. 

We can show this as follows. Clearly 7l{BlK^Bi) C A/'o since B^ = IIj^^B*. To establish equality we 
need to show that there exists no x G Mo other than such that R^Bx = 0, i.e., that 

Ar(5*i?«5) nTV; = {0}. (35) 

But 

^^{B*R^B)=^^{B*n^^in)B) 
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and 

b*iIti(r)B + B*iix(^^')B = i?*(n7^(R,) + n_^(R,))i? 

= B*B 

is invertible. Hence (l35t holds and so does (l34b . 

We finally need to show that the value for Q is an Hm-minimum of qn subject to (l25t . For any X such 
that XB = Om it also holds that XBi = Om- We can verify by direct substitution that 

q^(T + X) = n + XKX*, 

which proves that F as given represent the minimum and minimizer, respectively. 

We argue the validity of the alternative set of expressions ( I30H31I) as follows. For any e G (0, 1], 

R, = R + en^(R) (36) 

is positive definite with 

R;1 = Rtt + e-in^(R) 

as its inverse. We can now apply (iv) to argue that fi^, F^ are the minimal value and minimizer of qn^ 
subject to Fe-B = Im, as before. It follows that their limits satisfy TB = Im and FRF* = Q. Then Q is 
indeed the Hm-minimal value of qu (cf. (i)) by continuity of q^ on R. 

It is straightforward (but a bit cumbersome to typeset) to use the limits ( BUt and verify ( I28H29I) . To 
pursue this, express B*Il^^B as a 2 x 2 matrix with respect to the decomposition 

= Ar„ © 7^o. 

The (1,1) entry, a := Ilj^^B*Il^B\_^^ is invertible and so is the (2,2) entry 

Un^B*K^B\^^^ + e''B*n^(^R)B =: 7 + e~'6 

where 7, S are defined to represent the respective terms. The (2,2) entry is the only one involving the 
parameter e. Then, the inverse of B*Il^^B becomes 

a-i + o(e) -a-^f36-h + o{e'^) 



with P := IIj^^B*II^B\ti^. The limit gives the correct expression for ^l. The limit of F^ as e — * can be 
carried out similarly. 

■ 

Remark 2: It should be noted that R is not required to have the structure of a state-covariance of 
a reachable pair (A, B) (cf. Theorem [l]) since the matrix A does not enter at all in the statement of 
Proposition[T]. However, if this is the case (see Proposition |2l below) and R is a singular state-covariance , 
then VL is singular as well — a converse to the first part of statement (iv). □ 

For prediction backwards in time, the postdiction error 

00 

Uk — Uk\future = Wfc — hiUk+l (37) 

e=i 

corresponds to an element 

zL*Gr{z) = L*{In - Z-^A*)-^C* G zlCr 

with L G C"^". 

The constraint arising from the the identity in front of Uk in (l37t . translates into 

— -'m 
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while the variance of the postdiction error becomes 

L*RL. 

Proposition [l] applies verbatim and yields that: 
(i') there exists an -minimal postdiction error, 
(ii') The minimizer is unique if and only if 



R 

C 



n. 



rank( 

(iii') The variance of optimal postdiction error is equal to Om if and only if 

Cn^(R)C* is invertible. 

(iv') If rank(R) = n, then the variance of the optimal postdiction error is (strictly) positive definite and 
the unique minimizer is 

r, = R-^c*(CR-ic*)"^ 

(v') If the variance of the optimal postdiction error is equal to 0^, then a (non-unique) minimizer is 

Tr = n_v(R)C*(cn_^(R)C*)~^ (38) 

Similarly, the analog of (vi) holds as well. 

Remark 3: It is interesting to point out that the square-roots of the variances of prediction and post- 
diction errors (i?*R^^i?)^^/^ and (CR^^C*)"^/^ appear as left and right radii, respectively, in a Schur 
parametrization of the elements of Mr, in [11] (cf. [12, Remark 2]) and that, in view of the above, if one 
is zero so is the other. □ 



VI. When Mr contains a single element 

We now focus on the case where Mr consists of a single element, we analyze the nature of this unique 
power spectrum, and study ways to decompose R into a sum of two non-negative definite matrices, one 
of which has this property and another which may be interpreted as corresponding to noise. Conditions 
for Mr to be a singleton are stated next. 

Theorem 4: Let A, B satisfy ^ and R > for which ^ holds. Then, the set Mr is a singleton if 
and only if the following equivalent conditions hold: 

S) 5(7^(5),A^(R)) < 1, 

M ) 5*njv-(R)5 is invertible. (39) 

If (C, D) are selected so that V{z) in i|9| is inner, the above conditions are also equivalent to: 

dSSt) 5{TZ{C*),M{K)) < 1, 
M ) Cn^^(R)C* is invertible. 

Proof: As explained earlier, an element rf/i e M defines via dTJ) an F-function F{z) = Hldjj] which, 
in turn, defines a (possibly unbounded) non-negative operator on Hl^^^ via 

Conversely, this operator defines uniquely the function F{z) G F as well as the corresponding measure 
djj E M. (except of course for a skew-Hermitian constant in F{z) and an additive constant in fi). The 
restriction onto /C, 

W : /C /C : x{z) ^ U^x{z)F{zy 
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corresponds to "one half" of R as in (fT^ . and is specified by R (modulo a skew-Hermitian part). We 
proceed to show recursively that there exists a unique extension of W to a non-negative operator on 

for £ = 1, 2, . . ., and hence, to a non-negative operator on 1-L\^"\ 
Consider the representation 

zV{z) =Di + Ciz{I - zAiY^Bi 



with 



^1 



A 
C 

B 
D 

[0 /] 
0, 



(40) 



and 



Ri 



R Ri2 

Rl2 1^22 

the (non-negative) Hermitian part of an extension of W into /Ci. Then, from Theorem [H 



where 



Ri - AiRiAl = B{Hi + n\B* 



(41) 



(42) 



Let us first assume that (0^) holds (and hence, from Proposition [H that (O^-d) hold as well). Then 
FR = Omxn with r as in dTTt satisfying TB = Im- Because, Ri > 0, it follows that 



TRi 



otherwise it would be possible to render the quadratic form arRi2 + alll2^* + |apR22 indefinite with 
a suitable choice of a G C which would contradict Ri > 0. From (BTl) on the other hand, we have that 

Ri2 - ARC* = BHi + H*D*. 

Multiplying on the left and the right by F and B, respectively, we conclude that 

Hi = -FARC* - TH*D* 

is uniquely defined from the original data — hence, so is the "one-step" extension Ri of R. It remains to 
show that the condition (l39k) is still valid for the new data, i.e., that 

Bl'nM(Ri)Bi 

is also invertible. Since R is Hermitian, 

C" = A/'(R) © 7^(R) 

is an orthogonal decomposition. Then, the null space of Ri is the orthogonal direct sum of 



and 



n 



•R,(R) 



X 



: X G C", y G C'", Ri^ = 0}. 



16 



SUBMITTED TO IEEE TRANS. ON AUTOMATIC CONTROL 



Denote these two subspaces by A^i and respectively. Then, 

nAr(Ri) = n^/-^ + nA/-2 

where 
So, finally, 

i3tn^(R0^i = 5*n^^(R)5 + bihm.Bi > o, 

because 5*11^(^)5 is already positive definite. This completes the proof. ■ 
The unique element in Mr under the conditions of the theorem can be obtained, in principle, after 
extending R recursively for i = 1,2,... using (I40H42I) . This specifies a non-negative operator on a dense 
subset of 7-^2^™ which, in turn, specifies a corresponding positive real function F{z) and the measure 
can be obtained from the boundary limits of the real part of F(z) as a weak limit. However, an explicit 
expression for F(z) will also be given later on. Before we do this we explain some of the properties of 
this unique measure. 

The following result states that dfi is a singular measure with at most n = m points of increase, i.e., 
at most n — m spectral lines whose directionality is encapsulated in suitably chosen unitary factors. The 
spectral lines are in fact at the zeros of certain matrix-valued functions, namely 

^{z) ■=VG{z) = V{In- zA)-^B (43) 

and r as in Proposition [U which correspond to the optimal prediction error and represent the analog of 
the Szego-Geronimus orthogonal polynomials of the first kind, cf. [12]. 

Theorem 5: Under the assumptions and conditions of Theorem 01 the unique element in Mr is 
of the form 

d^{d) = Y,yiPiyid\]{e-e,) 

i=l 

where 

1 

6i e [0,2tt) ior i = 1, . . . ,q differ from one another, l]{6 - 6e) denotes a unit step at 6i, and pe > 0. 
The values e^^^ for £ = 1, . . . , g are the non-zero eigenvalues of the matrix (/„ - Br)A with F as in 
^27\. The matrices Ve are chosen so that 

and can be normalized to satisfy VgV^* = I as well as to make pe diagonal. 

Proof: Under the stated conditions, Mr is a singleton from the previous theorem and its unique 
element dfi satisfies 

/ " me^')dfx{eMe^'r) = TRY* = O^^^ (44) 
Jo 

with r as in (l27t . It readily follows that dp. can have points of increase only at the finitely many points 
Oi, i = 1, . . . , q, where $(e-'^) is singular. The "zeros" of ^(z) coincide with the "poles" of its inverse 

<l>(^)-^ = /„ - TA{In - zAoY^B (45) 

where 

Ao = {In - BT)A. (46) 



GEORGIOU: THE CARATHEODORY-FEJER-PISARENKO DECOMPOSITION 



17 



Since Ao has already m eigenvalues at the origin, the number of eigenvalues that it may have on the 
circle is at most n — m. Thus 

e=i 

where Me eMm, Mi>0, and 

Exressing = VipeYl with p^, Vi as claimed is standard. This completes the proof. ■ 
Thus, Mr being a singleton implies just as in the classical scalar case (e.g., [19], [15]) that the under- 
lying stochastic process is deterministic with finitely many complex exponential components. Subspace 
identification techniques represent different ways to identify "dominant ones" and obtain the "residue" pi 
that corresponds to each of those modes (see [15], [19], [9], [10]). In the present multivariable setting, 
in order to do something analogous, we need an explicit expression for the corresponding positive real 
function. This is done in the next section. 

Remark 4: A dual version of the representation in Theorem |5] gives that 6^ correspond to "zeros" on 
the circle of the optimal postdictor error 

L*{zln- A*)-^C\ 

Similarly, the range lZ{Ve), for £ = 1,2, ... ,g, is contained in the correspond null space of the above 
postdiction error when evaluated at the corresponding zeros. □ 

Remark 5: The "star" of the optimal postiction error can also be interpreted as a "right matricial 
orthogonal polynomial of the first kind" 

^r{z) = C{In-zA)-^L. (47) 

These matricial functions, i.e., $(2;) and ^r{z), together with their counterparts of the "second kind" '^{z) 
and '^r{z) that will be introduced in the next section, satisfy a number of interesting properties similar 
to those of the classical orthogonal polynomials [14] (cf. [4], [5]). We plan to develop this subject in a 
separate future publication. □ 

VIL The "central" positive real function 

With R, A, B, H satisfying (|5j5) in Theorem El we define 

Fme{z) := ^z)-'^{z) (48) 
where $(2;) = r(/„ - zA)'^B as before, 

<I/{z) := -r^(/„ - zA)-^AH* + D^, (49) 

and 

:= -r{H*B* - R)B{B*By\ (50) 
By eliminating the unobservable dynamics in the expression for Fme{z) we obtain 

FMEiz) = D^ + zCo{In-Ao)-'Bo (51) 

where 

Co := -TA 

Ao := {In-BT)A 

Bo := BD^ + H*. (52) 
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In case R > 0, Fme{z) is the positive-real functions which corresponds to the "maximum entropy" 
spectral measure dfiME{d) ^ Mr, i.e., the unique element of Mr which maximizes the entropy functional 

lifi) := / \ogdet{fi{e))de. 
Jo 

This element was identified in [12] as 

dfiME = He^T'^ me^'r'YdO, (53) 

without drawing the connection to ( BTT ). However, Fme{z) in (B51) is defined even when R is singular, in 
which case the corresponding measure may have a singular part obtained as the weak radial limit of the 
Hermitian part of Fme{z) 

dfiMEiO) = limRerm {FMEire^'^)}de. 

The singular part, which corresponds to purely deterministic components in the underlying time series, 
relates to the residues of Fme{z) at corresponding poles on the unit circle. This allows identifying spectral 
lines directly from Fme{z). It should be emphasized that (l53t is no longer valid in the case of a singular 
R. We first establish the claim that Fme is positive real and that it is consistent with R. 

Theorem 6: Let R, A, B, H satisfy ^) of Theorem El r given as in (j^HJ, and Fme{z) given as in 
Then 

(i) Fme{z) satisfies (EH), and 

(ii) Fme{z) G F. 

Proof: Condition ( fT^ is equivalent to 

^{z)V{z)* - <!>{z)HG{z)V{zy = ^z)Q{z). 

To show that this relationship holds for some Q{z) analytic in D, it suffices to show that all negative 
Fourier coefficients of 

- <^{z)HG{z)V{z)* (54) 

vanish. By collecting positive and negative powers of z we can express 

^{z)V{zy = {D^B* ~rAWA*){zIn- A*)-^C* 
+D^D* - TA{In - zA)-\zH*D* + WC*), 

and similarly that 

<^{z)HG{z)V{zy = rw*{zin- A*)-^c* 

+r(/„ - zA)-^wc* 

where W is given by (fT^ . Thus, negative powers of z in (I5?l) sum up into 

{D^B* - T{AWA* + W*)) {zin - A*)-^C*. 

Thus, to prove our claim (and because {A*, C*) is reachable), we need to show that 

D^B* - T{AWA* + W*) 

vanishes. Substituting the value for from ( l5Dt in the above the expression we get 

-T{H*B* - RB{B*B)-^B* + AW A* + W* 
= -T{W -RB{B*B)-^B* + W*) 
= -TR{In- B{B*B)-^B*). 
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Recall that T = lim^^o T^, from the proof of Proposition [H while satisfies 
Thus 

r,(R + enAr(R))(/n - B{B*B)-'B*) = 

identically for all e, and hence, taking the limit as e ^ we get the desired conclusion. This completes 
the proof of claim (i). 

We first argue that Fme{z) is analytic in D. Of course, '^{z) is already analytic in D by our standing 
assumption on the location of the eigenvalues of A. (Its poles cancel with the corresponding zeros of 
anyway.) We only need to consider $(2)^^ If R is invertible, then $(2;)^^ has no poles in D by 
[12, Proposition 1]. If R is singular, then, once again, we consider 

Re = R + en;v^(R,), with e > 0. 

With fi, = (5*R7^5)-i and V, = n,B*R;^ as before we define <^,{z) := T,G{z) and apply [12, 
Proposition 1] to deduce that $^(2;)"^ is analytic in the closed unit disc, for all e > 0. By continuity, 
$(2;) has no poles in the open unit disc. Similarly, the Hermitian part of Fme{z) in D is the limit of the 
Hermitian part of 

FME,e{z) := 

where '^e{z) is given by (R^ with F, R replaced by F^, R^, respectively. A matricial version of a classical 
identity between orthogonal polynomials (of first and second kind [14, equation (1.17)]) holds here as 
well: 

m{z)^{zy + ^{z)^{zy = n. (55) 

To verify this, after standard algebraic re-arrangement, the left hand side becomes 

Ao + zT{In - zA)-'B+ + z-'Bliln - z~'A*)-'r 

where 

= A{BD*^ - RF* + BHT*), and 
Ao = + -TAKA*T*. 

If R is invertible it is straightforward to show that 

BDl - RT* + BHT* = On,m 

while 

Ao = {B*R-^B)-^ = n. 

If R is singular then, as usual, we replace F, R by their e-perturbations and claim the same identities for 
the relevant limits. This shows that FME,e{z) E ¥ for all e > 0. Hence, so is Fme{z) since it is analytic 
in © and its Hermitian part is nonnegative being the limit of the Hermitian part of FME,e{z) as e — 0. ■ 
Remark 6: The relationship (l55t (cf. [14, equation (1.17)]) between matricial functions of the "first" 
and "second-kind" generalizes to a two-sided version. Indeed, if we introduce analogous quantities for a 
right fraction 

FMEiz) = '^riz)^rizy\ 

by taking $, (2;) as in (l47t and 

^r{z) := -L*zA{In- zA)-^Tr + D^^ 

D^^ = = -{CC*)-^C{C*L* - R)F„ 
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then these satisfy 

In the above we subscribe £, setting = and (^) = '^{z), to highlight "left functions" since 

^{z),'^{z) are the entries of the left fraction Fme{.z) = <l>(z)~^^(2;) of Fme{.z). □ 



Om 













(56) 



We begin with 



VIII. Multivariable "residues" and singular parts 

F{z) := FME{z)=Hzr'^{z) 

= D^ + Coz{In- zAoY^Bo 



(57) 



as given in ( BTT ). suppressing the subscript ''ME" for convenience. When R > 0, then $(2;) remains 
invertible in the closed unit disc and (l55t readily implies that 



Herm{F(e^'^)} = ^{e^^Y^n {^{e^^Y^Y . 



(58) 



cf. (l53t . But when R is singular, the variance of the minimal prediction error Vt is also singular (see 
Proposition |21 below) and (15^ may no longer be valid. The boundary limit of the Hermitian part defines 
a measure which may no longer be absolutely continuous. However, because F{z) is rational the singular 
part consists of finitely many disconinuities in jji{0). In order to separate the singular part from the 
absolutely continuous, we need to isolate the boundary poles of F{z). Accordingly, F{z) decomposes 
into a sum of "lossless" and "lossy" components — the lossless part being responsible for the singular part 
of the measure. 

In the case where F{z) G F is scalar-valued, the multiplicity of any pole 



iedB:={z : 



cannot exceed one and F{z) decomposes into 



romainme 



1} 



[z] with p > 0, 



where the first term is "lossless" and the second, -Fremaining(-2) G F, has no singularity at ^. Conformably, 

dlJi{9) = pdl]{9 - <0 + dflrcmmningiO) 

where <^ denotes the angle of C, (i.e., ^ = e-'^^) and dfi j.emainmg{0) is continuous at Thus, in general. 



F z) 



1 - z/^^ 



~l~ -^lossy(^) 



and the corresponding measure 



i=l 



Analogous facts hold true in the multivariable case with some exceptions. Singularities in R may not 
necessarily be associated with discontinuities in the measure and, while F(z) can have poles with higher 
multiplicity on the boundary of D, these may not have geometric multiplicity exceeding one. When F{z) 
has poles on the boundary, these are associated with discontinuities and our interest is to show how to 
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decompose F{z) into a lossless and a lossy part, in general, and thus isolate the singular part of the 
measure. 

We first discuss the significance of R being singular. With Ao, Bo, Co as in (15^ and T,n as in 
Proposition [T] it holds that 

R = BVIB* + AoRAl. (59) 

This can be verified directly (by careful algebra). It can also be shown via a limiting argument, replacing 
R, r, with RejFe, (/„ — BT^)A (as in the proof of Theorem |^ and invoking [12, equation (23)] to 
show that a similar identity holds for the perturbed quantities for all e > 0, hence for their limits as well. 
A direct consequence of ( l59b is the following. 

Proposition 2: Let A, B satisfy ^-6), R > 0, A, 5, R satisfy ©, and Vi the Hm-minimal value of 
gR subject to ^25^. If ^] > then R > 0. 

Proof: The pair {Ao, BVL^^"^) is a reachable pair since it is obtained from {A, B) after a state-feedback 
transformation and an invertible input tranformation. Then R must be the reachability Grammian from 
( I59t which cannot be singular. ■ 

Example 1: Elementary scalar examples suffice to demonstrate how singularities of R can give rise to 
poles of F(z) on dU). To see that this may not always be the case consider A, i? as in @ with n = A 
and m = 2, and let R which is now block- Toeplitz as in ([U) have entries 



Rn 



1 1 
1 1 



and Ri = -R 



Then 



r = [ /2 -\Ro ] andn = -Ro. 



Both R and Vt are singular while the eigenvalues of Ao are {0, 0, 0, |}. □ 
Next, we present some general facts about lossless rational matrices in F. If 

1 



then 



where 



Ds 

Cs 

B.. 



and As block diagonal with blocks of the form '1^ of size equal to the size of pn. Then Fs{s) E F 
but it is also lossless, which amounts to Herm {Fs(re-'^)} = a.e. on dB). It is a consequence of the 
Herglotz representation that, modulo a state transformation and an additive skew-Hermitian summand in 
D, any rational lossless function is necessarily of this form. An alternative characterization of lossless 
functions can be obtained via the well-known positive real lemma (e.g., [8]) which, for the case where 
the Hermitian part is to be identically zero, specializes to the following. 
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Proposition 3: A rational function D + Cz{In - zAy^B belongs to F and has Hermitian part 
identically equal to zero a.e. on the boundary of the unit circle if and only if there exists P > such 
that 



D 



P 

C* - 
D* - 



A* PA 
A*PB 
B*PB 





0. 



Proof: Nonnegativity of 



P - A* PA C* - A*PB 
C - B*PA D + D* - B*PB 



(60) 
(61) 
(62) 

(63) 



along with P > is equivalent to D + C z(In — zA) ^P G F by the positive real lemma (see [8, page 
70]). Now consider its Hermitian part 

[ B*z-'{h-z-'A*)-' ] 



X 



On C* 

C D + D* 



z{I„ 



-M)-ip 



and note that the null space of the mapping 

M ^ g{zyMg{z] 

where 



z{In - zA)-^B 



consists of matrices of the form 



P - A*PA 
-B*PA 



-A*PB 
-B*PB 



It readily follows that if conditions ( I60H62I) hold, then the function is lossless. If on the other hand 
do not hold and is simply nonnegative but not zero, then it can be shown that the Hermitian part 
can be factored into the product of nonzero spectral factors (cf. [8, page 125]). ■ 
Returning to dSTl . in case Ao has all its eigenvalues in the open disc D, then (l53t is valid and dSSt 
holds as well for all 9. In case Ag has eigenvalues on dB), we need to decompose F{z) into a lossless and 
a lossy summands. To do this, select Ti,T2 matrices whose vectors form bases for the eignespaces of A 
corresponding to eigenvalues on dH and those in the interior of the disc, respectively. Then Ao transforms 
into a block triangular matrix 



T-^AnT 



Ai 
A2 



where the spectrum of Ai is on the boundary and of A2 in the interior of the unit disc, respectively. The 
input and output matrices Pq, Co transform conformably into 



T-'Bo 
CoT 



B2 



and 



F{z) =D^ + Ciz{I - zAiY^Bi + C2z{I - zA^) 
Then we need to determine a value for a constant D\ so that 

Pi(2) = Pi + Ci2(/-zAi)-iPi 



-'B2. 
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is lossless. Necessarily, the remaining term Z^^ — Di + C2z{I — ^^2)^^-82 is in F and is devoid of 
singularities on the boundary. 

The transformation Ti above, can be chosen so that Ai is unitary, since Ai has only simple eigenvalues 
on (9D. Then condition (I^Dt leads to 

AiP = PAi 

and hence that P is a polynomial function of Ai, i.e., 

P = p{Ai) := Pol + piAi + ...+ pn,-iA'l'-\ 

rii being the size of Ai. The vector of coefficients [ po ... Pm-i ] can now be computed from (1611) 
which becomes 

A,C* = p{A^)B,. 

When m > 1, this is an overdetermined set of equations which necessarily has a solution. Finally, we 
may take 

= ^B*p{A,)B 

to satisfy (l62b and ensure that Fi{z) is lossless. The matricial residues which represent the discontinuities 
in dn(9) can now be computed by taking suitable limits at the singularities of Ai 

VepiV; = Herm{ lim (1 - ze^^')Fi{z)}, £ = 1,2, 



2;— >£■ 



Evidently, if Ai is first brought into a diagonal form, then a convenient closed expression for the limit 
can be given in terms of partitions of Bi, Ci corresponding to the eigenvalue e-'^^ 

IX. Impossibility of decomposition into white noise + deterministic part 

For the case of a scalar stochastic process {u^ ■ k E Z}, where m = 1, any state-covariance R can 
be written as 

-R' -R'signal -R-white noise 

where 

-R-white noise ClgR-O 

with Ro being the solution to the Lyapunov equation 

Ro - ARoA* = BB* 

and ao the smallest eigenvalue of the matrix pencil R — aRo, i.e., 

ao = min{a : det (R — aRo) = 0} (64) 
= max{a : R — a;Ro>0}. (65) 

The matrix Rq is the controllability Grammian of the pair {A, B) and represents the state-covariance 
when the input is unit-variance white noise. Then Rwhite noise represents the maximal summand of R that 
can be attributed to a white-noise input component of Q, while the remaining Rsignai corresponds to 
a deterministic input part. It can also be shown that this decomposition is canonical in the sense that 
any other one, consistent with a "white noise plus deterministic part" hypothesis for the input, will have 
a larger number of deterministic components (i.e., spectral lines). This is the interpretation of the CFP 
decomposition. The theory was originally developed for R's having a Toeplitz structure [15], [19] and 
extended to general state-co variances in [9], [10]. 

It is rather instructive to present a derivation of the fact that, when m = 1, the equivalent conditions 
(iii, iii-a) of Proposition [l] are automatically satisfied by any singular state-covariance. This underscores 
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the dichotomy with the multivariable case where a decomposition of R consistent with a "white noise 
plus deterministic part" input is not always possible (see Examples El and |3] below). 

Proposition 4: Let R, A, 5, i7 satisfy ^) in Theorem El let R > and singular, and let m = 1. 
Then i?*n^(R)i? is invertible. 



Proof: Suppose that 5*11^(^)5 is not invertible. Then 

nA/-(R)i? = 0„xl 



and 



(66) 
(67) 



n{B) c 7^(R). 

From (jSb) and (1^ it follows that n^(R,)y4Ry4*n^(R) = 0„xn, and hence, that 

nAr(R)^R = 0„xn- 

From (l67l) . n^(R)y4i? = 0„xi- By induction, using (jSj?), it follows that 

n^^(R)A^R = 0„xn, for £ = 0, 1, . . . 

and hence, that 7^(R) is A-invariant. But 7^(5) C 7?.(R) and so is the largest A-invariant subspace 
containing TZ{B). Because {A,B) is a reachable pair, 7?.(R) = C" which contradicts the hypothesis that 
R is singular. ■ 
The following example shows that the statement of the proposition is only valid when m = 1 and 
that, in general, a decomposition of R consistent with a "white noise plus deterministic part" input is not 
always possible. 



Example 2: Let 



A 



' 02 


O2 " 


to 




. ^2 


O2 





and 



R 



1 

1 

1/2 

3/4 1/2 



1/2 3/4 

1/2 

1 
1 



where, as usual, I2 and O2 are the 2x2 identity and zero matrices, respectively. It can be readily seen 
that they satisfy conditions @ as well as (ISJ?) in Theorem El — R being a block-Toeplitz matrix. Then 
R > and singular. To see this note that the first three principal minors of R are positive definite while 

(-2-112 ) R = Oix4- 
If the input to © is white noise with variance the 2 x 2 non-negative matrix 



Q 



a h 
b c 



then the state-covariance (for the chosen values of (A, B) and corresponding to this white-noise input) is 



Rn 



Q O2 
O2 Q 



Q. 



We claim that 



R — Rq ^ =^ Rq — 04x4- 

To prove this, consider that Q > from which we obtain 

ac > a > 0, c > 0. 



(68) 
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Now, ifv={-2 -1 1 2 ) then vRv' = and vRqv' > 0. Therefore 

R - Ro > 

vRqv' = 

=^ 5a + 43?e (6) + 5c = 0. (69) 

Thence, if P := (6), 

ac > 

25 

^ ac > — (a^ + c^ + 2ac) 
16 

2 2 34 
^ > a^ + c^ + — ac 

=^ either a = or c = 0. 

In either case, |6| = and hence all three a = 6 = c = from ( l69b . Thus, Q = 02x2 and Rq = O4X4 as 
claimed. □ 

While the previous example shows that no white noise component can be subtracted in the hope of 
reaching a state-co variance satisfying condition (iii) in Proposition [l] (thus corresonding to pure sinusoids), 
more is true. The following example shows that the off-diagonal block-entries of a block-Toeplitz R already 
prevent condition (iii) from being true. 

Example 3: Let A, B as in Example |2l and 



R = 



a 


b 


1/2 


3/4 


b 


c 








1/2 





a 


b 


3/4 





b 


c 



In order for condition (iii) of Proposition [T] to hold, the null space A/'(R) must have a dimension > 2 = 
dim(Jl(B)) (which can also readily seen from condition (iii-a) as well). We argue that this cannot happen. 
Since 

"a 3/4 " 
3/4 c 



is a principle minor of R > 0, neither a nor c can vanish. The rank of 



Rq :-- 



a b 
b c 



must be equal to one, since there is a 3 x 3 minor of R with determinant 



c X det(i?o)- 

Hence, ac = |6p =^ c = |6p/2a. But then, the northwest 3x3 principle minor of R is equal to — c/4 < 0, 
which contradicts R > 0. □ 
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X. Decomposition as a convex optimization problem 

We have just seen that in the case of a vectorial input, a decomposition of the state-covariance R of 
Q which is consistent with the hypothesis of "white noise plus a deterministic signal at the input" may 
not always be possible. We begin by choosing an alternative interpretation of the CFP decomposition as 
seeking to separate the maximal-variance white noise component at the input which is consistent with a 
known state-covariance. This is the analog of (l65t and leads to the following problem. 

Problem 1: Given R, A, B satisfying ©, R > 0, and (jS^) in Theorem |3 determine a decomposi- 
tion 

1^ -R'signal ~l~ -R-noise C^O) 

where the summands satisfy 

R-noise > 0, (71) 



R 



and 



Rn 



Rn 



- AR, 



■signal 

A* 



> 0, 



BQB* with Q > 0, 



argmaxjtrace R^ 



(|7T]-|73l) hold}. 



(72) 
(73) 

(74) 



This is a standard convex optimization problem where the noise variance trace Q is a linear functional 
of the parameters in Q and all constraints appear in the form of linear matrix inequalities. Thus, it can 
be readily and efficiently solved with existing computational tools. Alternatives to dT^t corresponding to 
a different "normalizations" are 



Rn 



argmax{trace (RnoiseW) : (l7T]-|73) hold}, (75) 

for any weight matrix W > (which may encapsulate "prior" information about the directionality of the 
noise), or to seek 

Q = argmax{trace Q : (1711- 17^ holdl. (76) 

Below we present an example which shows that a maximum-trace solution as above, in general, does 
not lead to a decomposition with Rsignai corresponding to a deterministic signal (i.e., satisfying (B^ ) even 
when an alternative decomposition does. 

Example 4: With A, B as in Example |2l consider the state-covariance 

ro n 1/2 3/4 
ri ro 1/2 
1/2 ro ri 
3/4 1/2 ri ro 

where the block-diagonal entries are yet unspecified. The values for these entries can be explicitly 
computed in the following two cases: 

(i) i?*n^(R)i? is invertible, and 

(ii) trace (R) is minimal, 
while always R > 0. 

The first can be carried out as follows. Condition (i) is equivalent to the existence of a matrix 


1 



R 



1 




7l,3 
72,3 



7l,4 
72,4 



such that FR is the zero matrix. Denote 

Rq := 



ri 



ri 



Ri 



1/2 




3/4 
1/2 



and 



7l,3 
72,3 



7l,4 
72,4 
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Since 



we deduce that 



Equation (TTTt leads to 



Ro + TqRI = 02 Ro + RiT* = 02, while 
Ri + To-Ro = O2, 

Ri — ro-RiTQ = O2, 

Ri — tIri = O2. 



(77) 
(78) 



Ri + RI = To{Ri + Rl)T^ 



and, if we factor Ri + RI = SS* with 







^ 3/4 ^l-d)' 

we deduce that S^^TqS must be unitary. Then from (ITSt we determine the eigenvalues of Tq. Carrying 
out all computations explicitely leads to 

1 



Rf) — — 
" 2 



and 



To 



cosW ^an(^) 
tan(6') 



cos(2e) 
cos(6») 

tan(^) 



cos(6») 



where sin(6') = |. The values in Rq is the unique set values for which (i) holds. 

Similarly, the computation of the state-covariance with minimal trace as in (ii) can be carried out 
explicitly to give 

_ r 3/4 1/2 

-'''0,min trace 1/2 3/4 

Finally, it is easy to check that Rq — -Ro,min trace is indefinite. □ 

XL Short-range correlation structure 

The rationale for the CFP decomposition has been re-cast in Problem [T] as seeking to extract the 
maximal variance that can be attributed to white-noise. In the case where R is block-Toeplitz as in 
dB^ this amounts to determining a block-diagonal matrix Rnoisc of maximal trace satisfying the required 
positivity constraints (I71H73I ). Yet, it is rarely the case in practice that a "white-noise" hypothesis is valid. 
Thus, we herein propose a new paradigm-a paradigm that also leads to a convex optimization problem 
and encompasses the above interpretation of the CFP decomposition as a special case. We seek to identify 
a maximal- variance summand which has a "short-range correlation structure" defined as follows: 

Definition 1: Given A, B satisfying © a state-covariance R of the system © has correlation range k 
if there exists a matrix H E C"^" so that 



H* 



B AB . 



A^B 



Ql 
Ql J 



(79) 



for suitable matrices Qo, • • • , Qk, such that 

R - ARA* 



BH + H*B* 



(80) 
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and 



Qo + zQi + ... + z'^Qk e ¥. 



(81) 



It is insightful to first consider the case where A, B are given as in © and the state-covariance structure 
is (£ + 1) X (£+ 1) block- Toeplitz. A block- Toeplitz matrix R has correlation range k if it is block-banded 
with all entries beyond the fcth one being zero and, most importantly, it remains a covariance matrix when 
extended with zero elements beyond the £th entry as well. This is equivalent to Rk+i = Rk+2 = ■ ■ ■ = 
Ri = . . . = Om being an admissible extension since already 

Ro + 2zRi + ... + 2z^Rk e F 

from (ISTT) because Qi = Ri (i = 1, . . . ,k) and Rq = QI + Qq. 

Example 5: The following elementary example helps illustrate the concept of bounded correlation range. 
Consider the Toeplitz matrix 



R 



1 
1/2 
1/3 



1/2 

1 
1/2 



1/3 
1/2 
1 



We seek a Toeplitz noise-covariance summand of maximal trace with correlation range 1, i.e., we seek 



Rn 



go qi 
qi qo qi 

qi qo 



so that R — Rnoisc > 0, and go + 2-2^1 G F. Since go + '2zqi is only of degree one, go + 2zqi G F if and 
only if |gi| < go. The solution turns out to be go = 2/3 and gi = 0.3097. 

Instead, if we sought Rnoisc diagonal corresponding to white noise, the answer would have been Rnoise = 
min{eig(R)} x I^. It can be easily checked that min{eig(R)} = 0.4402 < go. Thus, colored MA-noise 
allows a larger amount of energy to be accounted for. □ 

Problem [l] with condition (1731 replaced by 



Rn 



having correlation range k 



(82) 



is also a convex optimization problem. In general, the positive-real constraint (ISTT) can be expressed as a 
convex condition via the well-known positive-real lemma (e.g., see [8]), and the maximizer of the trace 
can be readily obtained with existing numerical tools (e.g., the Matlab LMI toolbox). 
In the case © has nontrivial dynamics, the right hand side of dSUt becomes 

BH + H*B* = A^BQIB* + ... + ABQIB* 
+B{Ql + Qo)5* + BQrB*A* + ... + BQkB*{A*)\ 

and R can be interpreted as the state covariance due to colored noise at the input with spectral density 

Qle-^' + . . . + Qle-^' + (Q* + Qo) + Qie'' + ■■■ + Qkc'"'. 



A detailed study on the potential of decomposition according to "correlation range" for high resolution 
spectral analysis will be presented in a forthcoming report. 
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XII. Concluding remarks 

The Caratheodory-Fejer-Pisarenko (CFP) decomposition underlies many subspace identification tech- 
niques in modern spectral analysis (such as MUSIC, ESPRIT, and their variants [19]). But in spite of 
its importance and its extensive appearance in many guises in the identification and signal processing 
literature, no multivariable analog had been proposed. Perhaps the reason can be sought in the fact that the 
exact analog of the CFP-decomposition does not exist. This realization led us to alternative interpretations 
of the CFP-decomposition, and the goal of this paper has been to explore such alternatives for a "signal 
plus noise" decomposition of covariances for multivariable processes. In the process we have found that 
(e.g., see Example |3] and Section 11X1) regardless of how much of the energy is accounted for by noise, the 
remaining energy, in general, cannot be accounted for by pure spectral lines only. The remaining energy 
necessarily corresponds to a singular covariance matrix and thus. Sections [VIII and IVIIII develop the needed 
theory to construct spectra for singular matrices. Finally Sections 1x1 and IXll develop certain alternatives 
to the CFP decomposition where we forgo the requirement that one part is completely deterministic, and 
allow instead that it has a long range correlation structure. 
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